Using MLP Features in SRI’s Conversation
Abstract
We describe the development of a speech recognition system for conversational telephone speech (CTS) that incorporates acoustic features estimated by multilayer perceptrons (MLPs). The acoustic features are based on frame-level phone posterior probabilities, obtained by merging two different MLP estimators: one based on PLP-Tandem features, the other on hidden activation TRAPs (HATs) features. This paper focuses on the challenges that arise when incorporating these nonstandard features into a full-scale speech-to-text (STT) system, as used by SRI in the Fall 2004 DARPA STT evaluations. First, we developed a series of time-saving techniques for training feature MLPs on 1800 hours of speech. Second, we investigated which components of a multipass, multi-front-end recognition system are most profitably augmented with MLP features for best overall performance. The final system achieved a 2% absolute (10% relative) reduction in word error rate (WER) over a comparable baseline system that did not include Tandem/HATs MLP features.
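As a rough illustration of the feature pipeline the abstract describes, the sketch below merges frame-level phone posteriors from two estimators and appends the log-compressed result to a baseline feature vector. The abstract does not say how the two MLP outputs are merged, so the fixed-weight average used here is an assumption, as are all function names and the toy data. Tandem-style systems typically also apply a dimensionality reduction step before concatenation, which is omitted here for brevity.

```python
import math

def merge_posteriors(post_a, post_b, weight_a=0.5):
    """Merge two per-frame phone posterior lists by weighted averaging.

    The fixed weight is an assumption made for this sketch; the source
    does not specify the merging rule.
    """
    merged = []
    for frame_a, frame_b in zip(post_a, post_b):
        frame = [weight_a * a + (1.0 - weight_a) * b
                 for a, b in zip(frame_a, frame_b)]
        total = sum(frame)
        # Renormalize so each frame remains a probability distribution.
        merged.append([p / total for p in frame])
    return merged

def log_posterior_features(merged, eps=1e-10):
    """Log-compress posteriors; eps guards against log(0)."""
    return [[math.log(p + eps) for p in frame] for frame in merged]

def augment(base_feats, mlp_feats):
    """Append MLP-derived features to the baseline features, frame by frame."""
    return [b + m for b, m in zip(base_feats, mlp_feats)]

# Toy data (hypothetical): 2 frames, 3 phone classes, 2 baseline dimensions.
post_a = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]  # stand-in for the PLP-Tandem MLP
post_b = [[0.5, 0.4, 0.1], [0.3, 0.6, 0.1]]  # stand-in for the HATs MLP
base = [[1.0, 2.0], [3.0, 4.0]]

merged = merge_posteriors(post_a, post_b)
feats = augment(base, log_posterior_features(merged))
print(len(feats), len(feats[0]))  # 2 5
```

Each output frame holds the baseline dimensions followed by one log-posterior per phone class, which is the form in which such features would be passed on to the acoustic model.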
Similar resources
Incorporating Tandem/HATs MLP Features into SRI’s Conversational Speech Recognition System
We describe the development of a speech recognition system for conversational telephone speech (CTS) that incorporates acoustic features estimated by multilayer perceptrons (MLPs). The acoustic features are based on frame-level phone posterior probabilities, obtained by merging two different MLP estimators, one based on PLP-Tandem features, the other based on hidden activation TRAPs (HATs) feat...
The efficient incorporation of MLP features into automatic speech recognition systems
In recent years, the use of Multi-Layer Perceptron (MLP) derived acoustic features has become increasingly popular in automatic speech recognition systems. These features are typically used in combination with standard short-term spectral-based features, and have been found to yield consistent performance improvements. However, there are a number of design decisions and issues associated with th...
Significant Attributes of Conversation Analysis in Social Interviews
This paper aims to show the role of significant attributes of conversation analysis in social interviews from a functional perspective. The concern is how these attributes act as important elements in analyzing the interviews. In particular, the author discusses some samples of BBC Learning English interviews which considered some conv...
SRI November 1993 CSR Spoke Evaluation
In this paper we present SRI’s results on the 1993 ARPA CSR Spoke Evaluations. This evaluation used the same HMM acoustic models as those used in SRI’s hub system: gender-dependent Genonic HMM’s. The system was made robust by modifying the front end algorithms to estimate the cepstral features (the HMM models were not modified). The robust front-end used a wide bandwidth (100-6400Hz) and estima...
Speech-to-text development for Slovak, a low-resourced language
Development of an automatic speech recognition (ASR) system for low-resourced languages is an important research topic in ASR. This paper reports on the development of a speech-to-text (STT) system targeting broadcast news and broadcast conversation transcription for the low-resourced Slovak language. Context-dependent acoustic models are trained without any manually transcribed audio data via ...